## [1] "cmte_id" "cand_id" "cand_nm"
## [4] "contbr_nm" "contbr_city" "contbr_st"
## [7] "contbr_zip" "contbr_zip_5" "contbr_employer"
## [10] "contbr_occupation" "contb_receipt_amt" "contb_receipt_dt"
## [13] "receipt_desc" "memo_cd" "memo_text"
## [16] "form_tp" "file_num" "tran_id"
## [19] "election_tp"
## cont$cand_nm: Bush, Jeb
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -5400 100 500 1167 2700 15000
## --------------------------------------------------------
## cont$cand_nm: Carson, Benjamin S.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -7100.0 25.0 50.0 153.3 100.0 10000.0
## --------------------------------------------------------
## cont$cand_nm: Christie, Christopher J.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -2700 100 1350 1444 2700 5400
## --------------------------------------------------------
## cont$cand_nm: Clinton, Hillary Rodham
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -5400.0 25.0 51.0 438.6 250.0 5400.0
## --------------------------------------------------------
## cont$cand_nm: Cruz, Rafael Edward 'Ted'
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -5450 35 50 251 100 10800
## --------------------------------------------------------
## cont$cand_nm: Fiorina, Carly
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -2700.0 50.0 100.0 259.8 250.0 5000.0
## --------------------------------------------------------
## cont$cand_nm: Graham, Lindsey O.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -2700.0 200.0 500.0 941.3 1500.0 5400.0
## --------------------------------------------------------
## cont$cand_nm: Huckabee, Mike
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -2700.0 43.0 100.0 452.4 500.0 5400.0
## --------------------------------------------------------
## cont$cand_nm: Jindal, Bobby
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -2700.0 100.0 250.0 778.6 1375.0 2700.0
## --------------------------------------------------------
## cont$cand_nm: Kasich, John R.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -1000.0 100.0 250.0 835.2 1000.0 2700.0
## --------------------------------------------------------
## cont$cand_nm: Lessig, Lawrence
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 27.94 137.50 250.00 633.70 500.00 2700.00
## --------------------------------------------------------
## cont$cand_nm: O'Malley, Martin Joseph
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -2700.0 100.0 250.0 656.2 750.0 5400.0
## --------------------------------------------------------
## cont$cand_nm: Pataki, George E.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 250.0 250.0 1000.0 666.7 1000.0 1000.0
## --------------------------------------------------------
## cont$cand_nm: Paul, Rand
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -2700.0 25.0 50.0 234.6 201.6 5000.0
## --------------------------------------------------------
## cont$cand_nm: Perry, James R. (Rick)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -2700 250 1000 1305 2700 2700
## --------------------------------------------------------
## cont$cand_nm: Rubio, Marco
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -5400.0 50.0 100.0 472.8 500.0 10800.0
## --------------------------------------------------------
## cont$cand_nm: Sanders, Bernard
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -3000 10 35 69 50 5000
## --------------------------------------------------------
## cont$cand_nm: Santorum, Richard J.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -2700 100 500 1066 2700 5400
## --------------------------------------------------------
## cont$cand_nm: Stein, Jill
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 20.0 103.0 250.0 186.7 250.0 250.0
## --------------------------------------------------------
## cont$cand_nm: Trump, Donald J.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -2700.0 100.0 250.0 352.7 276.1 5400.0
## --------------------------------------------------------
## cont$cand_nm: Walker, Scott
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -2700.0 250.0 500.0 811.6 1000.0 5400.0
## --------------------------------------------------------
## cont$cand_nm: Webb, James Henry Jr.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 20.0 100.0 500.0 545.8 500.0 2700.0
## cont$cand_nm: Bush, Jeb
## [1] 1166.82
## --------------------------------------------------------
## cont$cand_nm: Carson, Benjamin S.
## [1] 153.3212
## --------------------------------------------------------
## cont$cand_nm: Christie, Christopher J.
## [1] 1444.185
## --------------------------------------------------------
## cont$cand_nm: Clinton, Hillary Rodham
## [1] 438.6258
## --------------------------------------------------------
## cont$cand_nm: Cruz, Rafael Edward 'Ted'
## [1] 250.9629
## --------------------------------------------------------
## cont$cand_nm: Fiorina, Carly
## [1] 259.7782
## --------------------------------------------------------
## cont$cand_nm: Graham, Lindsey O.
## [1] 941.3373
## --------------------------------------------------------
## cont$cand_nm: Huckabee, Mike
## [1] 452.397
## --------------------------------------------------------
## cont$cand_nm: Jindal, Bobby
## [1] 778.6207
## --------------------------------------------------------
## cont$cand_nm: Kasich, John R.
## [1] 835.159
## --------------------------------------------------------
## cont$cand_nm: Lessig, Lawrence
## [1] 633.6856
## --------------------------------------------------------
## cont$cand_nm: O'Malley, Martin Joseph
## [1] 656.2477
## --------------------------------------------------------
## cont$cand_nm: Pataki, George E.
## [1] 666.6667
## --------------------------------------------------------
## cont$cand_nm: Paul, Rand
## [1] 234.6485
## --------------------------------------------------------
## cont$cand_nm: Perry, James R. (Rick)
## [1] 1305.47
## --------------------------------------------------------
## cont$cand_nm: Rubio, Marco
## [1] 472.8235
## --------------------------------------------------------
## cont$cand_nm: Sanders, Bernard
## [1] 69.00485
## --------------------------------------------------------
## cont$cand_nm: Santorum, Richard J.
## [1] 1065.83
## --------------------------------------------------------
## cont$cand_nm: Stein, Jill
## [1] 186.7143
## --------------------------------------------------------
## cont$cand_nm: Trump, Donald J.
## [1] 352.6604
## --------------------------------------------------------
## cont$cand_nm: Walker, Scott
## [1] 811.5589
## --------------------------------------------------------
## cont$cand_nm: Webb, James Henry Jr.
## [1] 545.8064
## [1] 124763 19
## Warning: Removed 9072 rows containing non-finite values (stat_bin).
## Warning: Removed 2 rows containing missing values (geom_bar).
## Warning: Removed 15094 rows containing non-finite values (stat_bin).
## Warning: Removed 2 rows containing missing values (geom_bar).
dim(cont)
## [1] 124763 19
The features of most interest are the size of contributions and the total number of contributions by candidate.
I believe the contributor's city and zipcode will support the investigation, since my hypothesis is that cities, and even individual zipcodes, have particular political profiles.
I created a few filters to make the data easier to visualize. First, I created a list of the leading candidates in the overall race, as well as lists of the 5 most populated and 5 least populated cities in Texas. I then created a new dataframe filtered by these lists for the analyses in the Bivariate and Multivariate sections below.
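The filtering step described above might look like the following sketch in base R. The candidate and city lists are illustrative assumptions (the report's actual lists are not shown), and `cont_toy` is a small made-up stand-in for the real `cont` dataframe; only the column names `cand_nm`, `contbr_city` and `contb_receipt_amt` come from the dataset.

```r
# Toy stand-in for the real `cont` dataframe (all values are made up).
cont_toy <- data.frame(
  cand_nm = c("Cruz, Rafael Edward 'Ted'", "Sanders, Bernard",
              "Pataki, George E.", "Clinton, Hillary Rodham"),
  contbr_city = c("HOUSTON", "AUSTIN", "DALLAS", "LUBBOCK"),
  contb_receipt_amt = c(50, 35, 250, 100),
  stringsAsFactors = FALSE
)

# Hypothetical filter lists; the lists used in the report are not shown.
leading_candidates <- c("Cruz, Rafael Edward 'Ted'", "Sanders, Bernard",
                        "Clinton, Hillary Rodham")
cities_of_interest <- c("HOUSTON", "AUSTIN", "DALLAS")

# Keep only the rows that match both lists.
cont_filtered <- cont_toy[cont_toy$cand_nm %in% leading_candidates &
                          cont_toy$contbr_city %in% cities_of_interest, ]
print(cont_filtered)
```

Using `%in%` against named vectors keeps the lists easy to change when the set of leading candidates shifts during the race.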
The contribution amounts seemed to cluster at particular denominations (e.g. $10, $25, $50, $100) rather than form a smooth distribution. In addition, there were a number of negative values. Upon further investigation of the dataset, the negative contributions turned out to be due to contributors reposting the donated amount in a spouse’s name, reallocating the donation to the general party, or changing the donated amount.
Given the large number of candidates, I subsetted the data to the leading candidates and plan to use that subset for more granular analyses, e.g. contributions by zipcode.
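A minimal sketch of how the clustering and the negative values could be inspected in base R. The data and the description strings are made up for illustration; `contb_receipt_amt` and `receipt_desc` are column names from the dataset, and `cont_toy` is a hypothetical stand-in for `cont`.

```r
# Toy stand-in for `cont` (amounts and descriptions are made up).
cont_toy <- data.frame(
  contb_receipt_amt = c(25, 50, 50, 100, -2700, 25, 50, -50),
  receipt_desc = c("", "", "", "", "redesignated (example text)",
                   "", "", "refund (example text)"),
  stringsAsFactors = FALSE
)

# Most frequent amounts, to check for clustering at round denominations.
amount_counts <- sort(table(cont_toy$contb_receipt_amt), decreasing = TRUE)
print(head(amount_counts, 3))

# Negative rows, alongside the description field that may explain them.
negatives <- cont_toy[cont_toy$contb_receipt_amt < 0, ]
print(negatives)
```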
## Warning: Removed 2550 rows containing non-finite values (stat_boxplot).
## Warning: Removed 2550 rows containing missing values (geom_point).
For most candidates the median donated amount fell between $50 and $500. Chris Christie showed the highest median contribution amount at $1350, while Ben Carson, Hillary Clinton, Ted Cruz, Rand Paul, and Bernie Sanders had median donated amounts of roughly $50 or less. This comparison can be misleading on its own, because the candidates with the most contributions in Texas are Ted Cruz (61k), Ben Carson (19k), Bernie Sanders (14k) and Hillary Clinton (13k), while the remaining candidates have fewer than 5000 contributions each.
Comparing the number of contributions with the median contribution shows that, while Ted Cruz’s donations have a low median dollar amount, he has the largest number of contributions. This raises the question of how much each candidate has raised thus far. Not surprisingly, Ted Cruz has raised the most in his home state, over $15MM, with Hillary Clinton in a distant second place at roughly $6MM. The majority of candidates did not raise more than $1MM, including Bernie Sanders, who as shown above has roughly 14000 contributions, slightly more than Hillary Clinton and behind only Ben Carson and Ted Cruz.
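As a rough cross-check using figures already reported, about 61k contributions at a mean of roughly $251 comes to approximately $15MM, which is consistent with the total quoted for Ted Cruz. The per-candidate count, median and total could be computed along these lines; this is a sketch in base R on made-up data, where `cont_toy` and the candidate labels "A" and "B" are placeholders rather than values from the report.

```r
# Toy stand-in for `cont` (values are made up).
cont_toy <- data.frame(
  cand_nm = c("A", "A", "A", "B", "B"),
  contb_receipt_amt = c(50, 50, 100, 1000, 2700),
  stringsAsFactors = FALSE
)

# Number of contributions, median amount, and total raised per candidate.
n_contrib  <- tapply(cont_toy$contb_receipt_amt, cont_toy$cand_nm, length)
median_amt <- tapply(cont_toy$contb_receipt_amt, cont_toy$cand_nm, median)
total_amt  <- tapply(cont_toy$contb_receipt_amt, cont_toy$cand_nm, sum)

by_candidate <- data.frame(
  cand_nm = names(n_contrib),
  n       = as.vector(n_contrib),
  median  = as.vector(median_amt),
  total   = as.vector(total_amt)
)

# Largest totals first.
print(by_candidate[order(-by_candidate$total), ])
```

Putting the three measures side by side makes it easier to see when a low median is offset by a large number of contributions.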
Further investigation shows that Bernie Sanders has the lowest ratio of contributions to contributors, followed closely by Ben Carson. Jeb Bush, Chris Christie, Rick Perry and Rick Santorum all have ratios greater than $1000.
A couple of interesting findings were 1) Donald Trump’s fundraising is almost non-existent in Texas and 2) Bernie Sanders’s fundraising in Texas is greater than I imagined it would be.
The clearest relationships are between city size and the number of contributions (the bigger the city, the more contributions) and between candidate and the number of contributions (Ted Cruz gathering the most).
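The report does not show how this ratio was computed. One plausible reading, sketched below in base R, divides each candidate's total dollars by the number of distinct contributor names. That definition, the toy data, and the placeholder labels are all assumptions; `contbr_nm` and `contb_receipt_amt` are column names from the dataset.

```r
# Toy stand-in for `cont` (names and amounts are made up).
cont_toy <- data.frame(
  cand_nm   = c("A", "A", "A", "B", "B"),
  contbr_nm = c("DOE, JANE", "DOE, JANE", "ROE, RICHARD",
                "POE, ANN", "POE, ANN"),
  contb_receipt_amt = c(10, 20, 30, 1000, 1500),
  stringsAsFactors = FALSE
)

# Total dollars per candidate.
total_amt <- tapply(cont_toy$contb_receipt_amt, cont_toy$cand_nm, sum)

# Distinct contributor names per candidate (assumed definition of "contributor").
n_contributors <- tapply(cont_toy$contbr_nm, cont_toy$cand_nm,
                         function(x) length(unique(x)))

# Dollars per distinct contributor, lowest first.
ratio <- total_amt / n_contributors
print(sort(ratio))
```

Matching on name alone will merge different people who share a name, so a stricter key (for example name plus zipcode) may be preferable on real data.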
By looking at just the top 5 and bottom 5 cities in Texas by population, and looking at the current leading candidates, some interesting trends emerge. For one, it is quite clear that Ted Cruz dominates the contribution scene across Texas cities.
While Austin pulls more Bernie Sanders and Hillary Clinton contributions than other cities, Houston has a sizable number of Bernie Sanders contributions, second only to Austin. Not surprisingly, the size of a city seems to affect the number and size of contributions. Looking deeper into the mix of zipcodes within cities, it becomes clear that contribution patterns depend on the particular zipcode. Houston’s 77006, for instance, has the largest number of Bernie Sanders contributions. That zipcode’s highlights are two prominent art collections and the University of St. Thomas, with a mean age between 25 and 30. Contrast that with zipcode 77024, outside the inner loop of Houston proper, which has a mean age between 45 and 55.
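A breakdown like the one described above can be sketched with a cross-tabulation in base R. The rows below are made up, `cont_toy` is a hypothetical stand-in for `cont`, and the upper-case city spelling is an assumption about how the field is stored; `contbr_city`, `contbr_zip_5` and `cand_nm` are column names from the dataset.

```r
# Toy stand-in for `cont` (rows are made up).
cont_toy <- data.frame(
  cand_nm = c("Sanders, Bernard", "Sanders, Bernard",
              "Cruz, Rafael Edward 'Ted'", "Cruz, Rafael Edward 'Ted'",
              "Sanders, Bernard"),
  contbr_city  = c("HOUSTON", "HOUSTON", "HOUSTON", "HOUSTON", "AUSTIN"),
  contbr_zip_5 = c("77006", "77006", "77024", "77006", "78701"),
  stringsAsFactors = FALSE
)

# Restrict to one city, then count contributions by zipcode and candidate.
houston <- cont_toy[cont_toy$contbr_city == "HOUSTON", ]
zip_by_candidate <- table(houston$contbr_zip_5, houston$cand_nm)
print(zip_by_candidate)
```

Rows of the resulting table are zipcodes and columns are candidates, so a single row shows the candidate mix within one zipcode.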
Plot One illustrates the number of contributions to 2016 political candidates in the state of Texas. Ted Cruz garners by far the largest share of the contributions at over 60000, more than four times as many as either Bernie Sanders or Hillary Clinton.
Plot Two expands upon Plot One by illustrating contributions to 2016 political candidates in the state of Texas by the five most populated and five least populated cities in the state. The number of candidates was reduced to the current leading candidates in the race.
Plot Three expands upon Plot Two by investigating contributions to leading candidates made in the city of Houston by zipcode.
Ted Cruz dominates Texas. He garners a staggering number of contributions across the state, his home turf. However, his influence is not without some challenge. While Houston favors him more strongly than any other Texas city, Houston’s second and third largest contribution bases are Hillary Clinton and Bernie Sanders, respectively. Additionally, Austin holds the second largest number of contributors in Texas, and there Bernie Sanders and Hillary Clinton have more contributors than Ted Cruz. But who is contributing? Looking deeper into Houston zipcodes, a split appears. While Ted Cruz is still a significant player, zipcodes with younger demographics yield higher Bernie Sanders and Hillary Clinton support.
While the macro view shows a domination by Ted Cruz in terms of his contributing base, his lead is driven primarily by two cities (Houston and San Antonio) and even within Houston there are signs that a sizable percentage of youth veers toward Democratic candidates.
And in all of this, where is Trump? Nowhere in any meaningful way. Additionally, Marco Rubio is not well capitalized. Is this due to the assumed dominance of Cruz, who represents the state as a Senator? The answer is unclear at this point. This analysis does suggest, however, that Ted Cruz’s dominance is not as solid as it initially appeared to be.
It is interesting to be able to gather such narrow and specific data on the contributors to political campaigns (and rightly so). Additional analyses relating occupation to political contributions could add richness to the analysis. Incorporating demographics directly into the dataset, joining on zipcodes or even street names, could also add a layer of detail. This could ultimately form the basis for a predictive model in which contribution levels can be forecast with greater accuracy, resulting in better information for both contributors and the parties that receive those contributions.